
Enforce READ COMMITTED isolation when using mysql #15714

Merged

Conversation

@SamWheating (Contributor) commented May 7, 2021

Closes: #15559

In the investigation in the above issue, we found that MySQL's default isolation level ended up causing some really weird issues when running multiple schedulers.

This ended up causing skipped tasks while backfilling and caused DAGs using depends_on_past to completely deadlock.

By enforcing READ COMMITTED when using MySQL we should be able to avoid these issues entirely. I don't expect that this will introduce any issues when using MySQL 5.x and/or a single scheduler, as READ COMMITTED is the default isolation level used by PostgreSQL.

I haven't had a chance to actually test this implementation, as we fixed the errors in our production Airflow environment by updating the isolation level on the database itself. If you'd like, I can set up a default MySQL DB, replicate the issue, and ensure that this fix works as expected.
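
For readers following along, a minimal sketch of what this kind of enforcement looks like with SQLAlchemy (the variable names and config plumbing here are illustrative, not the actual Airflow patch):

```python
# Illustrative sketch only -- not the actual Airflow code. SQLAlchemy's
# create_engine() accepts an isolation_level argument that the MySQL
# dialects translate into SET SESSION TRANSACTION ISOLATION LEVEL.
from sqlalchemy import create_engine

SQL_ALCHEMY_CONN = "mysql+mysqldb://user:pass@localhost/airflow"  # example DSN

engine_args = {}
if SQL_ALCHEMY_CONN.startswith("mysql"):
    # InnoDB defaults to REPEATABLE READ, which takes gap locks on
    # range queries with FOR UPDATE; READ COMMITTED locks only the
    # matching rows, matching PostgreSQL's default behaviour.
    engine_args["isolation_level"] = "READ COMMITTED"

engine = create_engine(SQL_ALCHEMY_CONN, **engine_args)
```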


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

boring-cyborg bot added the "area:Scheduler including HA (high availability) scheduler" label on May 7, 2021
@ashb (Member) commented May 7, 2021

@potiuk Possibly one more for 2.0.3

@potiuk (Member) commented May 7, 2021

> @potiuk Possibly one more for 2.0.3

Yep. I see.

@ashb (Member) commented May 7, 2021

Could you split the dialect change to a separate PR please?

github-actions bot added the "full tests needed (We need to run full set of tests for this PR to merge)" label on May 7, 2021
github-actions bot commented May 7, 2021

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

github-actions bot commented May 7, 2021

The Workflow run is cancelling this PR. Building images for the PR has failed. Follow the workflow link to check the reason.

SamWheating force-pushed the sw-force-read-committed-with-mysql branch from 965b08f to 79c0ff9 on May 7, 2021 14:22
@ashb (Member) commented May 7, 2021

Looks like we test the engine args @SamWheating -- could you fix the tests?

@potiuk (Member) commented May 7, 2021

Yep. I'd love to merge it and include in 2.0.3

@SamWheating (Contributor, Author) commented May 7, 2021

Tests should be fixed now 🤞; they actually surfaced an indentation error that caused the isolation level to remain unset if connection pooling was disabled.
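
To illustrate the class of bug described here (a hypothetical reconstruction, not the actual diff): if the isolation setting sits inside the pooling branch, disabling the pool silently skips it.

```python
# Hypothetical reconstruction of the bug class described above -- not
# the actual Airflow diff.
from sqlalchemy.pool import NullPool

pool_enabled = False  # illustrative flag, not the real config name
engine_args = {}

if pool_enabled:
    engine_args["pool_size"] = 5
    # BUG: indented one level too deep, so the isolation level was
    # only set when pooling was enabled.
    engine_args["isolation_level"] = "READ COMMITTED"
else:
    engine_args["poolclass"] = NullPool

# FIX: dedent the assignment so it applies regardless of pooling.
engine_args["isolation_level"] = "READ COMMITTED"
```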

@potiuk (Member) commented May 7, 2021

IT WORKS!

@potiuk (Member) commented May 7, 2021

Still, one more test to fix :(

github-actions bot commented May 7, 2021

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


@SamWheating (Contributor, Author) commented May 8, 2021

😩 my bad, I forgot to run the tests locally with all backends. Should be good to go now 🤞

Update: Got some other CI failures now, but I don't think they're related to my changes, I'll rebase on a fresh master and see if it fixes anything 🤔

SamWheating force-pushed the sw-force-read-committed-with-mysql branch from 3e9db43 to ef6e45d on May 8, 2021 17:48
@potiuk (Member) commented May 8, 2021

> Update: Got some other CI failures now, but I don't think they're related to my changes, I'll rebase on a fresh master and see if it fixes anything

It should. We had a few problems with failing master due to too hasty merges and some selective checks slipping through the cracks, but this is all solved now.

@potiuk (Member) commented May 9, 2021

Random failure. Merging :)

potiuk added this to the Airflow 2.0.3 milestone on May 9, 2021
potiuk merged commit 231d104 into apache:master on May 9, 2021
potiuk pushed a commit that referenced this pull request May 9, 2021
* Enforce READ COMMITTED isolation when using mysql

* Fixing up tests, removing indentation

* Fixing test

(cherry picked from commit 231d104)
@alippai commented May 14, 2021

I'm a little confused about this change. If decreasing the isolation level helps, then necessary COMMITs are being missed. If anything, raising the isolation level to SERIALIZABLE should be the solution. Were these alternatives considered? Airflow is a scheduler; dropping transaction safety in response to concurrency issues sounds like a super-weird solution. Are we sure about this?

@potiuk (Member) commented May 16, 2021

I think the change is needed mostly because of the locking behavior in those different isolation levels.

We are using row-level locking in MySQL 8 with the HA scheduler, and if you look at the behaviour of the default REPEATABLE READ: when search criteria are used while locking rows, MySQL also locks the gaps between the matching rows, not only the rows themselves, which likely causes the deadlocks mentioned by @SamWheating.

In the case of READ COMMITTED, the gaps between rows are never locked. Inserts could create phantom rows in this case, but that is really not an issue in the way we use the locks (they are read/update locks, not inserts). Even if more rows are initially locked while the WHERE clause is evaluated, once it has been evaluated and rows are selected for locking, the locks on the rows that were not selected are freed.

SERIALIZABLE behaves more like REPEATABLE READ, but it converts plain SELECTs into SELECT ... FOR SHARE, so those are "read allowed / write locked". I don't think this solves the problem though, because with HA schedulers we have several schedulers trying to run SELECT ... FOR UPDATE on the same table with some selection criteria, which means deadlocks are still possible.

As far as I see the change is sound :)

More information here: https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html
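
To make the gap-locking point concrete, here is a sketch of the kind of range-locking query the discussion is about and how the enforced isolation level interacts with it (the DSN, table, and column names are illustrative, not Airflow's actual SQL):

```python
# Two sessions (think: two HA schedulers) each lock a range of rows.
# Under InnoDB's default REPEATABLE READ, the index gaps scanned by the
# WHERE clause are locked too, so sessions can deadlock even when
# SKIP LOCKED hands them disjoint rows; under READ COMMITTED only the
# returned rows stay locked.
from sqlalchemy import create_engine, text

engine = create_engine(
    "mysql+mysqldb://user:pass@localhost/airflow",  # example DSN
    isolation_level="READ COMMITTED",  # the behaviour this PR enforces
)

with engine.begin() as conn:
    rows = conn.execute(
        text(
            "SELECT id FROM dag_run WHERE state = 'queued' "
            "ORDER BY id LIMIT 100 FOR UPDATE SKIP LOCKED"
        )
    ).fetchall()
    # ... update the claimed rows; the commit at block exit releases the locks.
```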

@SamWheating @ashb - did I get this correctly?

@alippai - does it sound convincing?

@SamWheating (Contributor, Author) commented

Hey @alippai, thanks for bringing this up. I am in agreement with Jarek's comment and don't expect this to introduce any new issues. However if there's something I've missed then I am happy to reevaluate these changes.

In general, if you all feel that this change is risky or too opinionated (as we're potentially overriding people's MySQL configuration without making it super clear), we could revert and approach this problem differently, either by:

  • Mentioning this in the docs and telling people to set this isolation at their database level.
  • Moving the isolation_level out to a config option, similar to the connection pooling parameters.

Thoughts?

@alippai commented May 16, 2021

I think what you wrote is correct, but this change is a little bit different. What you describe would be achieved by starting a new transaction with the selected isolation level, using exclusive locks, and submitting COMMITs between them, or by killing one of the transactions in the deadlocked HA case (if that's really a concurrency issue).

The change in this PR affects all of Airflow's operations. It might be a working workaround if the transactions are not committed / rolled back in a timely manner, but I'm not sure that avoiding deadlocks through undocumented, implicit assumptions about how the different parties use all the tables is the right approach. Explicit exclusive locks on the problematic transactions tend to be more maintainable (unless we see major performance degradation).

The issue above doesn't happen with INSERTs; however, I'd assume INSERTs are happening to the same table from different scheduler instances (or the UI / API), and this change may have a side-effect for those.

@alippai commented May 16, 2021

I agree that this is still a valid solution, and opinionated is good (you know the usage and I don't).
The important point is exactly that I don't know the usage: if I had to contribute to the scheduler, this could blow up in my face (MySQL is no longer protecting me, so I have to be aware of exactly which queries the different parties run).

The change is right; I would just reduce the scope of these "unsafe" regions by starting only selected transactions with the READ COMMITTED isolation level, or by putting exclusive write locks into the current ones (which should likely prevent deadlocks if no other party is allowed to use the resource at all).
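
The narrower scoping suggested here can be expressed with SQLAlchemy's per-connection execution options (a sketch; the DSN and query are placeholders):

```python
# Sketch: scope READ COMMITTED to an individual transaction instead of
# the whole engine, via SQLAlchemy's per-connection execution options.
from sqlalchemy import create_engine, text

engine = create_engine("mysql+mysqldb://user:pass@localhost/airflow")  # example DSN

with engine.connect().execution_options(
    isolation_level="READ COMMITTED"  # applies to this connection only
) as conn:
    with conn.begin():
        conn.execute(text("SELECT 1"))  # placeholder for the "unsafe" query
```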

@alippai commented May 16, 2021

Docs already cover that we are using SKIP LOCKED and it needs a modern DB. SKIP LOCKED is also a great tool for preventing deadlocks with the smallest scope possible. https://github.com/apache/airflow/blob/master/docs/apache-airflow/concepts/scheduler.rst#database-requirements

Disregard my concerns if these are the wrong solutions that wouldn't help your issue; I just hadn't seen a discussion about this yet and felt like we were bringing nukes instead of a scalpel :)
The fact that you took the time to answer in detail already reassures me that this is not something that just "slipped through".
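
For reference, the SKIP LOCKED pattern mentioned above looks roughly like this in SQLAlchemy ORM terms (a sketch; the filter is a simplified stand-in for what the scheduler actually does, and the import assumes an Airflow installation):

```python
# Sketch: claim a disjoint batch with SELECT ... FOR UPDATE SKIP LOCKED
# so concurrent schedulers never block on rows another scheduler holds.
from airflow.models import TaskInstance
from airflow.utils.state import State
from sqlalchemy.orm import Session

def claim_batch(session: Session, limit: int = 100):
    return (
        session.query(TaskInstance)
        .filter(TaskInstance.state == State.SCHEDULED)
        .with_for_update(skip_locked=True)  # peers skip rows we hold
        .limit(limit)
        .all()
    )
```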

@potiuk (Member) commented May 16, 2021

I think what also matters is how SQLAlchemy works and how we are using it. I am not really too concerned about changing the default to READ COMMITTED for all transactions, because:

a) we usually have very short transactions, and we usually work on the same set of rows/tables that we retrieve in the first query of the transaction

b) SQLAlchemy will retrieve the rows we are working on and store them as objects in memory, and only when we flush them / commit the transaction will SQLAlchemy merge the changes back

c) the scheduler works (in 2.0) in small "tight" loops. Basically, it retrieves N records (say the first 100 matching the criteria), locks them, and then only that thread of that scheduler performs any changes to those rows and related data, merging them back. Then it commits, goes back, and retrieves the next 100 matching rows. So contention and parallel access to the same rows is not really possible (under normal circumstances).

Also, the scheduler (which is the important one) does indeed use SKIP LOCKED (but I believe locking the same gaps from different schedulers might cause the deadlocks in some scenarios even if SKIP LOCKED is used).

@ashb -> I might not have the whole picture, so maybe you can comment here. BTW, I think it might be one of the interesting topics for your talk at the Summit :) https://twitter.com/AshBerlin/status/1393861492282429443

@saolof commented Jun 17, 2024

Reading this in hindsight, this change is weird.

Postgres uses READ COMMITTED as the default primarily for backwards compatibility; it throws exceptions at higher isolation levels when it detects a write conflict, and its race detector was added more than a decade after the initial release because it required a huge amount of work. The Postgres READ COMMITTED implementation technically provides MAV (it guarantees a consistent view of the database within any single statement), which is stronger than MySQL's READ COMMITTED (which guarantees only that each row is consistent within a statement, so a MySQL SELECT can see some rows from before a multi-row update and some from after it).

MySQL picked REPEATABLE READ as the default because its READ COMMITTED implementation, while technically ANSI-compliant, offers much weaker consistency guarantees than other databases, so you should bump it up one level from what you would use in other DBs.

If you see concurrency issues at REPEATABLE READ, the way to fix them should be to bump the isolation level to SERIALIZABLE in all integration tests against Postgres and fix all of the 40001 exceptions Postgres gives you, even if that requires changing the schema to facilitate it. The actual problem here seems to be that MySQL is bad at propagating error codes from a detected race condition to the user, so Airflow used to continue after a transaction was aborted, while Postgres + SQLAlchemy will give you an explicit exception at higher isolation levels. Fixing the errors raised by Postgres should also fix the concurrency issue that caused MySQL to ignore transactions.
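
A sketch of the approach described here (illustrative, not Airflow code): run against Postgres at SERIALIZABLE and retry transactions that fail with SQLSTATE 40001.

```python
# Sketch: SERIALIZABLE isolation against Postgres, retrying on
# serialization failures (SQLSTATE 40001), as suggested above.
from sqlalchemy import create_engine, text
from sqlalchemy.exc import OperationalError

engine = create_engine(
    "postgresql+psycopg2://user:pass@localhost/airflow",  # example DSN
    isolation_level="SERIALIZABLE",
)

def run_with_retry(work, attempts: int = 5):
    for attempt in range(attempts):
        try:
            with engine.begin() as conn:  # commits on success, rolls back on error
                return work(conn)
        except OperationalError as exc:
            # psycopg2 exposes the SQLSTATE as .pgcode on the wrapped error.
            if getattr(exc.orig, "pgcode", None) == "40001" and attempt < attempts - 1:
                continue  # write conflict detected; retry the whole transaction
            raise

# Example usage:
result = run_with_retry(lambda conn: conn.execute(text("SELECT 1")).scalar())
```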

Labels: area:Scheduler including HA (high availability) scheduler; full tests needed (We need to run full set of tests for this PR to merge)
Successfully merging this pull request may close the following issue: "airflow dag success, but tasks not yet started, not scheduled."
5 participants